Latent Diffusion Bridges for Unsupervised Timbre Transfer Demo
This demo page accompanies the paper "Latent Diffusion Bridges for Unsupervised Timbre Transfer".
Source code: link
Timbre Transfer Results
Normal Instruments Created with Our Method
| Source | Target |
|---|---|
| flute | violin (DPD: 0.07, JD: 0.0) |
| flute | trumpet (DPD: 0.05, JD: 0.0) |
| violin | flute (DPD: 0.1, JD: 0.2) |
| violin | trumpet (DPD: 0.13, JD: 0.1) |
| trumpet | flute (DPD: 0.02, JD: 0.0) |
| trumpet | violin (DPD: 0.02, JD: 0.0) |
| bassoon | cello (DPD: 0.12, JD: 0.0) |
| cello | bassoon (DPD: 0.07, JD: 0.0) |
Pitch-Shifted
| Source | Target |
|---|---|
| flute shifted -20 semitones | bassoon (DPD: 0.21, JD: 0.0) |
| flute shifted -25 semitones | bassoon (DPD: 0.6, JD: 0.25) |
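
For reference, a shift of n semitones in twelve-tone equal temperament scales every frequency by 2^(n/12), so both shifts in this table move the flute down by more than an octave. The sketch below only illustrates that arithmetic; the helper name `semitone_ratio` is ours, and the page does not state which tool was used to pitch-shift the audio.

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency ratio for a pitch shift in 12-tone equal temperament."""
    return 2.0 ** (semitones / 12.0)


# The two shifts listed in the table above (negative values shift downwards).
for shift in (-20, -25):
    ratio = semitone_ratio(shift)
    print(f"{shift} semitones -> frequencies scaled by {ratio:.3f}")
```

A shift of -12 semitones gives a ratio of exactly 0.5 (one octave down), which is a quick sanity check on the formula.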
Chunk-Based Minibatch
| Source | Target |
|---|---|
| flute (model trained with time chunk size 4 and channel chunk size 0) | violin (DPD: 0.12, JD: 0.0) |
| flute (model trained with time chunk size 4 and channel chunk size 32) | violin (DPD: 0.2, JD: 0.0) |
| violin (model trained with time chunk size 4 and channel chunk size 0) | flute (DPD: 0.09, JD: 0.0) |
| violin (model trained with time chunk size 4 and channel chunk size 32) | flute (DPD: 0.13, JD: 0.1) |
Impact of Different Sigma Max and Sigma N
| Source | Noise | Target |
|---|---|---|
| violin (model with sigma_max=100 and sigma_N=100) | Noisy violin | flute (DPD: 2.39, JD: 0.64) |
| violin (model with sigma_max=100 and sigma_N=5) | Noisy violin | flute (DPD: 0.12, JD: 0.1) |
Shared Space
The following audio samples were generated using the flute and violin models, both with sigma_max=100 and sigma_N=100, by sampling directly from N(0, sigma_max). Below, we provide one example of an audio pair that was considered melodically similar and one that was not.
| Flute | Violin |
|---|---|
| Similar Melodies (DPD < 0.7) | DPD: 0.52, JD: 0.18 |
| Different Melodies (DPD >= 0.7) | DPD: 1.77, JD: 0.25 |
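
The split into similar and different melodies is a plain threshold on DPD at 0.7. The sketch below shows that rule applied to the two pairs listed in the table, assuming the DPD values have already been computed elsewhere; the function name and the dictionary layout are illustrative and not taken from the paper's code.

```python
DPD_THRESHOLD = 0.7  # threshold stated on this page


def melodically_similar(dpd: float, threshold: float = DPD_THRESHOLD) -> bool:
    """Return True when a flute/violin pair falls under the DPD threshold."""
    return dpd < threshold


# Metric values reported for the two example pairs above.
pairs = [
    {"dpd": 0.52, "jd": 0.18},
    {"dpd": 1.77, "jd": 0.25},
]

labels = ["similar" if melodically_similar(p["dpd"]) else "different" for p in pairs]
print(labels)
```

Note that a pair with DPD exactly equal to 0.7 lands in the "different" group, matching the `>=` in the table.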